Install gate, Phase 0: vuln-api contract + test harness #110
juangaitanv wants to merge 8 commits into
Running the suite from a git hook (e.g. pre-commit in a worktree) leaks GIT_DIR into the tests' subprocesses, pointing their git init at the developer's repo — locked mid-commit — instead of the temp dir.
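One way such a scrub could look is sketched below. This is not the helper from tests/cli_deps.rs; the names `git_env_keys` and `scrubbed_git` are hypothetical, and the sketch assumes that removing every inherited variable whose name starts with `GIT_` is sufficient.

```rust
use std::ffi::OsString;
use std::path::Path;
use std::process::Command;

/// Keep only the variable names that start with `GIT_`.
/// (Hypothetical helper, split out so the filter can be tested without
/// touching the real process environment.)
fn git_env_keys<I: IntoIterator<Item = OsString>>(keys: I) -> Vec<OsString> {
    keys.into_iter()
        .filter(|key| key.to_string_lossy().starts_with("GIT_"))
        .collect()
}

/// Build a `git` command rooted at `dir` that will not see any `GIT_*`
/// variable inherited from the parent process (for example `GIT_DIR`
/// exported by a pre-commit hook).
fn scrubbed_git(dir: &Path) -> Command {
    let mut cmd = Command::new("git");
    cmd.current_dir(dir);
    for key in git_env_keys(std::env::vars_os().map(|(key, _)| key)) {
        // `env_remove` stops the child from inheriting this variable.
        cmd.env_remove(key);
    }
    cmd
}
```

A test would then call something like `scrubbed_git(temp_dir).arg("init").status()`, so that `git init` resolves the repository from the temp dir rather than from the hook's `GIT_DIR`.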
The vuln-api client and its versioned contract (clean / vulnerable / malware / unknown verdicts, remediation data), harvested from the install-vuln-gate spike (dfac68e) and trimmed to phase scope: public unauthenticated lookups only, no retries, no user-facing command.
- src/vuln_api: blocking client for /v1/packages/.../check with status mapping, identity guard, and PEP 503 request-time normalization
- src/vuln_api_stub: in-process TCP stub, gated out of release builds via the test-stub feature + self dev-dependency
- tests/common: shared GateHarness scaffold for later phases
- tests/vuln_api_contract.rs: contract tests against the stub (hermetic) and the staging worker (#[ignore], deterministic targets documented in tests/fixtures/vuln_api/README.md)
…, staging CI
Addresses Cursor review on #110.
- vuln_api identity guard now applies the ecosystem's canonical-name rule to the response package_name before comparing, not just eq_ignore_ascii_case. A response echoing the stored spelling (`flask_cors` for a `flask-cors` request, which is PEP 503-equivalent) no longer trips the guard and fails the gate closed for valid pypi packages with `_`/`.` in their names. New unit test covers it.
- tests/harness_smoke.rs exercises the GateHarness scaffold directly (fake package manager on PATH, registry stub, vuln-api stub) so the wiring can't silently regress before a later phase drives it end-to-end.
- .github/workflows/staging-contract.yml runs the #[ignore]d staging contract tests on a daily schedule (non-blocking) so endpoint/schema/seed drift is caught out-of-band instead of shipping undetected.
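The canonical-name comparison described in the first bullet can be sketched as follows. The function names are hypothetical, not the ones in src/vuln_api; the normalization itself follows PEP 503 (lowercase, with each run of `-`, `_`, `.` collapsed to a single `-`).

```rust
/// PEP 503 normalization: lowercase the name and collapse every run of
/// `-`, `_` or `.` into a single `-`.
fn pep503_normalize(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut in_separator_run = false;
    for ch in name.chars() {
        if matches!(ch, '-' | '_' | '.') {
            // Emit one `-` for the whole run of separators.
            if !in_separator_run {
                out.push('-');
            }
            in_separator_run = true;
        } else {
            out.push(ch.to_ascii_lowercase());
            in_separator_run = false;
        }
    }
    out
}

/// Identity guard sketch (hypothetical name): the package echoed in the
/// response must be the same PyPI project as the one requested, where
/// "same" means equal after PEP 503 normalization.
fn same_pypi_package(requested: &str, echoed: &str) -> bool {
    pep503_normalize(requested) == pep503_normalize(echoed)
}
```

With this rule `same_pypi_package("flask-cors", "flask_cors")` holds, whereas a plain case-insensitive comparison would reject the pair.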
cargo test runs the module's tests on parallel threads. Five of them bind ephemeral ports, and between port_is_available_reflects_current_port_usage's drop(listener) and its re-check, a concurrent :0 bind could be handed the just-freed port, flipping the second assert to false. Add a module-level PORT_TEST_LOCK that all five port-binding tests acquire. The async test scopes the guard to the synchronous reserve so no lock is held across .await (keeps clippy::await_holding_lock clean).
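A minimal sketch of the locking pattern, assuming a plain `std::sync::Mutex` is enough. `PORT_TEST_LOCK` is the name from the commit message; `lock_ports`, `port_is_available`, and `check_port_roundtrip` are illustrative stand-ins, not the module's actual code.

```rust
use std::net::TcpListener;
use std::sync::{Mutex, MutexGuard};

/// Serializes every test that binds a port, so a concurrent `:0` bind
/// cannot be handed a port that another test has just released.
static PORT_TEST_LOCK: Mutex<()> = Mutex::new(());

/// Acquire the lock, recovering from poisoning so that one failed test
/// does not cascade into every other port test.
fn lock_ports() -> MutexGuard<'static, ()> {
    PORT_TEST_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Illustrative stand-in: a port counts as available if it can be bound.
fn port_is_available(port: u16) -> bool {
    TcpListener::bind(("127.0.0.1", port)).is_ok()
}

/// Returns (available while bound, available after drop).
fn check_port_roundtrip() -> (bool, bool) {
    let _guard = lock_ports(); // held for the whole bind/drop/re-check window
    let listener = TcpListener::bind(("127.0.0.1", 0)).expect("bind ephemeral port");
    let port = listener.local_addr().expect("local addr").port();
    let while_bound = port_is_available(port);
    drop(listener);
    let after_drop = port_is_available(port);
    (while_bound, after_drop)
}
```

In an async test the guard would be confined to a synchronous block that ends before the first `.await`, which is what keeps `clippy::await_holding_lock` quiet.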
…riant docs, harness opt-out
- pypi wire names now use the server's rule (lowercase + trim, worker.js normalizePackageName), NOT PEP 503: collapsing zope.interface to zope-interface missed the stored advisory row and read vulnerable dotted/underscored packages as clean. PEP 503 remains the identity-comparison rule (Ecosystem::request_name vs normalize_name), and the stub's key() now mirrors the server so the divergence can't be masked in tests again.
- Module/auth docs no longer claim lookups are permanently unauthenticated: production /check requires a Corgea token (staging runs VULN_API_REQUIRE_AUTH=false); token wiring lands with authenticated mode. Test renamed public_check_sends_no_auth_headers.
- GateHarness::without_vuln_api() opt-out for no-endpoint tests.
- utils/api.rs get_source() delegates to the cached vuln_api::source().
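The wire-name rule from the first bullet is small enough to show directly. This is a sketch under the stated "lowercase + trim" description; `pypi_wire_name` is a hypothetical name, not the actual `Ecosystem::request_name` implementation.

```rust
/// Name as sent on the wire for pypi lookups: trimmed and lowercased,
/// with `.` and `_` left untouched so the lookup hits the stored row.
fn pypi_wire_name(name: &str) -> String {
    name.trim().to_lowercase()
}
```

Here `zope.interface` stays `zope.interface` on the wire, while the identity comparison on the response can still treat `zope-interface` and `zope.interface` as the same project.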
…coding, clippy --all-targets
- Reject contradictory verdicts in both directions (is_vulnerable must agree with matches presence); false+non-empty was the dangerous false-negative.
- VulnMatch.tier: u8 -> Option<u8> so a null/missing tier no longer fails the whole response (server emits row.tier unclamped on /check).
- encode_npm_name percent-encodes every component (scoped + unscoped); output is identical for valid names, robust against reserved chars (matches pypi).
- Correct two factually-wrong comments (npm casing; 'staging spells PyPI') against the real server (worker.js echoes ecosystem/name/version verbatim).
- source() -> &'static str and validated_base() -> &str: drop per-request clone / per-call alloc on the per-package hot path.
- Harness strict clippy now --all-targets (lints tests + the test-stub module); cleared the two doc_lazy_continuation warnings it surfaced.
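The verdict consistency check from the first bullet might look roughly like this. Only `is_vulnerable`, `matches`, and `VulnMatch.tier` come from the commit message; the `id` field, the `ContractError` enum, and `validate_verdict` are assumptions made for illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
struct VulnMatch {
    id: String,       // hypothetical field, for illustration only
    tier: Option<u8>, // a null or missing tier is tolerated
}

#[derive(Debug, PartialEq)]
enum ContractError {
    /// `is_vulnerable` is true but no matches were returned.
    VulnerableWithoutMatches,
    /// `is_vulnerable` is false but matches were returned
    /// (the dangerous false-negative direction).
    CleanWithMatches,
}

/// Reject responses whose verdict flag disagrees with the match list.
fn validate_verdict(is_vulnerable: bool, matches: &[VulnMatch]) -> Result<(), ContractError> {
    match (is_vulnerable, matches.is_empty()) {
        (true, true) => Err(ContractError::VulnerableWithoutMatches),
        (false, false) => Err(ContractError::CleanWithMatches),
        _ => Ok(()),
    }
}
```

Treating both contradictions as errors means the caller fails closed instead of guessing which half of the response to trust.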
std::env::var("CORGEA_SOURCE").unwrap_or_else(|_| "cli".to_string())
fn get_source() -> &'static str {
    // One definition of the CORGEA-SOURCE value (cached there).
    corgea::vuln_api::source()
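For context, a cached `source()` returning `&'static str` could be built on `std::sync::OnceLock`, as sketched below. This is an assumption about how the caching might work, not the actual body of `vuln_api::source()`; the `CORGEA_SOURCE` variable and the `"cli"` fallback are taken from the replaced line above.

```rust
use std::sync::OnceLock;

/// Read `CORGEA_SOURCE` once and hand out the same `&'static str` on
/// every later call, so the hot path neither re-reads the environment
/// nor allocates.
fn source() -> &'static str {
    static SOURCE: OnceLock<String> = OnceLock::new();
    SOURCE
        .get_or_init(|| std::env::var("CORGEA_SOURCE").unwrap_or_else(|_| "cli".to_string()))
        .as_str()
}
```

One consequence of this design is that a change to the environment variable after the first call is not observed, which is usually acceptable for a short-lived CLI process.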
Instead of creating a separate client, it's better to refactor this one to optionally include authorisation headers, so that any debugging logic and error handling is centralised in one place.
Agreed that the debug logging can be centralized, and I'll pull that into a shared helper as a follow-up. I'd keep the clients separate though: the vuln-api host is user-configurable and the shared client is auth/cookie/redirect-enabled by construction (cookies and redirects are fixed at build time, so optional auth alone won't cover it), so merging would add footguns to the auth-bearing client for ~12 lines of setup.
Ibrahimrahhal
left a comment
How are we going to handle the auth experience for this new API? Are we going to have two login commands?
No auth required for now, to reduce adoption friction.
Record why the vuln-api client is deliberately separate from the shared CLI client (user-configurable host must not replay Corgea credentials), point at the enforcing test, and drop the play-by-play of utils::api internals so the comment can't go stale when that module changes.
Overview
This PR starts the install-gate feature for the Corgea CLI.
The install gate will eventually check npm and PyPI package versions against Corgea's vulnerability API before install flows proceed, so vulnerable or malicious versions can be blocked before they reach the developer's environment.
This is the first PR in a stacked restart of the install-gate work. The previous attempt built the right shape, but it was too large to review as one change. This restart lands one phase per PR, each with explicit exit criteria.
This PR is Phase 0: foundation only. No user-facing command ships here, and no package install is blocked yet.
What Phase 0 Includes
Phase 0 adds the client, contract, and test scaffold that later install-gate phases will build on:
- src/vuln_api/ - blocking client for GET /v1/packages/{eco}/{name}/versions/{ver}/check. It is independent of the shared CLI HTTP client because the vuln-api host is user-configurable via CORGEA_VULN_API_URL, so it must not replay Corgea cookies or redirects. It includes status mapping, a confused-deputy identity guard, request-time PyPI name normalization, and fixed request headers.
- src/vuln_api_stub/ - minimal in-process TCP stub for tests, gated out of release builds behind the test-stub cargo feature. There is no standalone binary.
- tests/common/ - GateHarness, the shared integration-test scaffold for later phases. It provides an isolated corgea, a private PATH of fake package managers, registry stubs, and vuln-api stubs.
- tests/fixtures/vuln_api/ - committed contract response bodies for clean, vulnerable, malware, and unknown package checks.
- tests/vuln_api_contract.rs - contract tests against the hermetic stub plus ignored staging-worker tests for the live staging endpoint.
- tests/harness_smoke.rs - smoke coverage proving the harness wires the fake package manager, registry stub, and vuln-api stub.
- .github/workflows/staging-contract.yml - scheduled non-blocking staging contract checks so endpoint, schema, or seed-data drift is caught out of band.
Deliberately Out Of Scope
Later phases will add:
Exit Criteria - Met
- cargo test
- ./harness check passes.
Also Included
A one-line prerequisite fix (0e5beb4): tests/cli_deps.rs's git helper now scrubs inherited GIT_* env. Running the suite from the pre-commit hook in a worktree leaked GIT_DIR into the tests' subprocesses, pointing their git init at the developer's repo instead of the temp dir. Required for the harness to pass cleanly under the commit hook.
Review Notes
The staging worker (https://cve-worker-staging.corgea.workers.dev) is the current default endpoint (DEFAULT_VULN_API_URL); the production-worker handoff and seed-data ownership are open questions for later phases, not blockers for Phase 0.
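To make the default-endpoint behaviour concrete, here is a rough sketch of resolving the base URL and building the check path. `DEFAULT_VULN_API_URL` and the path shape come from this description; `resolve_base` and `check_url` are hypothetical names, the override is passed in explicitly rather than read from `CORGEA_VULN_API_URL`, and the sketch assumes the name and version are already percent-encoded.

```rust
const DEFAULT_VULN_API_URL: &str = "https://cve-worker-staging.corgea.workers.dev";

/// Pick the user-supplied base if one is set and non-empty, otherwise
/// fall back to the default; strip trailing slashes either way.
fn resolve_base(override_url: Option<&str>) -> String {
    let raw = match override_url.map(str::trim) {
        Some(url) if !url.is_empty() => url,
        _ => DEFAULT_VULN_API_URL,
    };
    raw.trim_end_matches('/').to_string()
}

/// Build the `/check` URL. Assumes `name` and `ver` are already
/// percent-encoded by the caller.
fn check_url(base: &str, eco: &str, name: &str, ver: &str) -> String {
    format!("{base}/v1/packages/{eco}/{name}/versions/{ver}/check")
}
```

In the real client the override would come from the environment, for example `resolve_base(std::env::var("CORGEA_VULN_API_URL").ok().as_deref())`.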